Data Generation: Incompressible Navier-Stokes

Authors

Jayjay, Tuna, Jason, Richard

Published

September 30, 2024

In this document, we cover dataset and Fisher Information Matrix (FIM) generation for the incompressible Navier-Stokes equations.

Incompressible Navier-Stokes

  • We follow the dataset generation scheme from Physics-Informed Neural Operator.
  • For validation purposes, we currently form the full Fisher Information Matrix and then compute its eigenvectors.
  • Our next step is a low-rank approximation or trace estimation, so that we do not have to form the full matrix.
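The trace-estimation route mentioned above could, for example, use a Hutchinson estimator, which needs only matrix-vector products with the FIM rather than the full matrix. A minimal sketch with a stand-in PSD matrix (the names and sizes here are illustrative, not our pipeline's):

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_probes = 200, 2000
A = rng.standard_normal((d, d))
fim = A @ A.T / d                         # stand-in symmetric PSD "FIM"

def hutchinson_trace(matvec, d, n_probes, rng):
    """Estimate tr(F) using only matrix-vector products, via Rademacher probes."""
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=d)   # Rademacher probe: E[z z^T] = I
        total += z @ matvec(z)                # z^T F z is an unbiased trace estimate
    return total / n_probes

approx = hutchinson_trace(lambda z: fim @ z, d, n_probes, rng)
exact = float(np.trace(fim))
```

In our setting the `matvec` would be a FIM-vector product assembled from vector-Jacobian products, so the \(d^2\) matrix is never materialized.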

Dataset

Our dataset consists of \(2000\) pairs \(\{K, S^t(K)\}_{t=1}^8\), where \(K\) is a permeability model and \(S^t(K)\) is the saturation at time \(t\).

Figure 1: Example permeability models (panels: (a) K0, (b) K1)
Figure 2: Example saturation time series (panels: (a) time series of saturation of K0, (b) time series of saturation of K1)

Fisher Information Matrix

  • To find the optimal number of observations, \(M\), we visualize the eigenvectors and the vector-Jacobian product.
  • We observe that as \(M\) increases, the boundary of the permeability becomes clearer, which should be more informative during training and inference.
  • Given one data pair, \(\{K, S^t(K)\}^8_{t=1}\), we get a single FIM.

Computing Fisher Information Matrix for each datapoint

We consider a realistic scenario in which we only have access to samples, not the underlying distribution. Let \(N\) be the number of samples and \(X \in \mathbb{R}^{d \times d}\); a neural network \(F_{nn}\) learns the mapping \(X_i \rightarrow Y_i\). For each pair in \(\left\{X_i, Y_i \right\}^N_{i=1}\), we generate a FIM, giving \(\left\{FIM_i\right\}_{i=1}^{N}\).

  • \(N\) : number of data points \(\left\{X_i, Y_i \right\}\)
  • \(M\) : number of observations \(Y\) per data point

\[ \left\{ X_i \right\}^N_{i=1} \sim p_X(X), \quad \epsilon \sim \mathcal{N}(0, \Sigma), \quad \Sigma = I \] For a single data pair, we generate multiple observations: \[Y_{i, J} = F(X_i) + \epsilon_{i, J}, \quad \text{where} \quad \left\{ \epsilon_{i,J}\right\}^{N,\,M}_{i,J = 1,1} \overset{iid}{\sim} \mathcal{N}(0, I)\] Since we assumed Gaussian noise, we define the likelihood as \[p(Y_{i,J}|X_i) \propto e^{-\frac{1}{2}\|Y_{i,J}-F(X_i)\|^2_2}\] \[\log p(Y_{i,J}|X_i) = -\frac{1}{2}\|Y_{i,J}-F(X_i)\|^2_2 + \text{const}\] The FIM for a single data pair \(i\) is: \[FIM_i = \mathbb{E}_{Y_{i,J} \sim p(Y_{i,J}|X_i)} \left[ \left(\nabla \log p(Y_{i,J}|X_i)\right)\left(\nabla \log p(Y_{i,J}|X_i)\right)^T\right] \approx \frac{1}{M}\sum_{J=1}^{M} \left(\nabla \log p(Y_{i,J}|X_i)\right)\left(\nabla \log p(Y_{i,J}|X_i)\right)^T\]
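A minimal Monte-Carlo sketch of the per-datapoint FIM above, assuming \(\Sigma = I\) and using a stand-in linear map in place of the trained surrogate \(F_{nn}\), so that the Jacobian, and hence the exact FIM \(J^T J\), is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
d, M = 16, 500                            # flattened dimension, observation count

A = rng.standard_normal((d, d)) / np.sqrt(d)

def F(x):                                 # stand-in for the surrogate F_nn
    return A @ x

def fim_single_datapoint(x, M):
    """Monte-Carlo FIM_i: average of score outer products over M observations."""
    J = A                                 # Jacobian of the linear stand-in F
    fim = np.zeros((d, d))
    for _ in range(M):
        y = F(x) + rng.standard_normal(d)   # Y_{i,J} = F(X_i) + eps, eps ~ N(0, I)
        score = J.T @ (y - F(x))            # grad_x log p(y|x) = J^T (y - F(x))
        fim += np.outer(score, score)
    return fim / M

x = rng.standard_normal(d)
fim = fim_single_datapoint(x, M)
exact = A.T @ A                           # analytic FIM for Gaussian noise, Sigma = I
rel_err = np.linalg.norm(fim - exact) / np.linalg.norm(exact)
```

For a neural surrogate the score \(J^T(y - F(x))\) would come from a vector-Jacobian product rather than an explicit Jacobian.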

How does the FIM change as the number of observations increases?

The FIM is the expectation of the outer product of the score (the gradient of the log-likelihood). As expected, the diagonal structure becomes better defined as \(M\) increases.

Figure 3: Change in FIM[:256, :256] of a single data pair \(\{K, S^t(K)\}^8_{t=1}\) as the number of observations \(M\) increases (panels: \(M = 1, 10, 100\))

Making Sense of the FIM Obtained

Still, does our FIM make sense? How can we better understand what the FIM is representing?

Let’s look at the first few rows of the FIM and reshape each to [64, 64].

Figure 4: First, second, and third rows of the FIM (panels: FIM[0,:], FIM[1,:], FIM[2,:])
  • As expected from the definition of the FIM, each plot is just a different linear transformation of \(\nabla \log p(\{S^t\}^8_{t=1}|K)\).
  • As we will see below, each row of the FIM is a noisy version of its leading eigenvector.

How do the eigenvectors of the FIM look as \(M\) increases?

\(M = 1\) (Single Observation)

Figure 5: The three largest eigenvectors of the FIM (panels: first, second, and third eigenvector)
  • Even when the FIM is computed from a single observation, the largest eigenvector shows the most definition in the shape of the permeability; the remaining eigenvectors look more like noise.

\(M = 10\)

Figure 6: The three largest eigenvectors of the FIM (panels: first, second, and third eigenvector)

\(M = 100\)

Figure 7: The three largest eigenvectors of the FIM (panels: first, second, and third eigenvector)

\(M = 1000\)

Figure 8: The three largest eigenvectors of the FIM (panels: first, second, and third eigenvector)
  • As \(M\) increases, the flow through the channel becomes clearer.
  • The boundary of the permeability becomes sharper.
  • In general, the eigenvectors get less noisy.
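The eigenvector extraction used in the figures above can be sketched as follows, with a random PSD matrix standing in for a computed FIM. Note that `np.linalg.eigh` returns eigenvalues in ascending order, so we reverse them to get the largest first:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 256
J = rng.standard_normal((d, d)) / np.sqrt(d)
fim = J.T @ J                             # stand-in symmetric PSD "FIM"

def top_eigenvectors(fim, k=3):
    """Return the k largest eigenvalues and their eigenvectors, descending."""
    w, v = np.linalg.eigh(fim)            # eigh: ascending order for symmetric input
    order = np.argsort(w)[::-1][:k]
    return w[order], v[:, order]

vals, vecs = top_eigenvectors(fim, k=3)
leading = vecs[:, 0].reshape(16, 16)      # reshape for plotting, as in the figures
```

For the full \(4096 \times 4096\) FIM one would reshape each eigenvector to [64, 64]; the \(16 \times 16\) reshape here just matches the smaller stand-in.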

How does the vector-Jacobian product look as \(M\) increases?

Figure 9: Normalized vector-Jacobian product, where the vector is the largest eigenvector of the FIM (panels: \(M = 1, 10, 100, 1000\))
  • We observe that the vector-Jacobian product looks more like the saturation than the permeability.
  • As \(M\) increases, the scale of the color bar also increases.
  • One possible conclusion:
    • The vjp tells us the locations in the spatial distribution (likelihood space) with the largest variation, and thus the most information about the parameter.
    • \(J^Tv\), where \(v\) is the largest eigenvector of the FIM, projects the Jacobian onto the direction of maximum sensitivity.
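A small sketch of the vector-Jacobian product \(J^Tv\) discussed above, using central finite differences of the scalar \(g(x) = \langle v, F(x)\rangle\) and a stand-in linear forward map (a real pipeline would use reverse-mode autodiff instead of finite differences):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 32
A = rng.standard_normal((d, d)) / np.sqrt(d)

def F(x):                                 # stand-in forward map
    return A @ x

def vjp(F, x, v, eps=1e-6):
    """Finite-difference J^T v: the gradient of the scalar g(x) = <v, F(x)>."""
    g = lambda z: float(v @ F(z))
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (g(x + e) - g(x - e)) / (2 * eps)   # central difference
    return grad

x = rng.standard_normal(d)
v = rng.standard_normal(d)
v /= np.linalg.norm(v)                    # stand-in for the leading FIM eigenvector
jtv = vjp(F, x, v)
```

Since \((J^Tv)_i = \partial_{x_i}\langle v, F(x)\rangle\), one backward pass (or, here, \(d\) finite-difference probes) yields the whole product without forming \(J\).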

Incompressible Navier Stokes

Dataset

Figure 10: The first and the last vorticity in a single time series (panels: vorticity at \(t=0\) and at \(t=40\))

Our dataset consists of 50 pairs \(\{\varphi^{t-1}(x_0), \varphi^t(x_0)\}^T_{t=1}\), where \(T=44\). The initial vorticities are Gaussian random fields.
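One common way to sample such Gaussian random field initial conditions is to filter white noise in Fourier space on a periodic grid. A minimal sketch; the spectrum below is an illustrative assumption, not necessarily the exact one used for this dataset:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 64
k = np.fft.fftfreq(n, d=1.0 / n)          # integer wavenumbers on a periodic grid
kx, ky = np.meshgrid(k, k, indexing="ij")
ksq = kx**2 + ky**2
ksq[0, 0] = 1.0                           # avoid division by zero at the mean mode

spectrum = (ksq + 9.0) ** (-2.0)          # smooth, rapidly decaying power spectrum
noise = np.fft.fft2(rng.standard_normal((n, n)))
field = np.real(np.fft.ifft2(noise * np.sqrt(spectrum)))
field -= field.mean()                     # zero-mean initial vorticity
```

Shaping white noise by \(\sqrt{S(k)}\) gives a stationary Gaussian field with power spectrum \(S(k)\); changing the exponent and offset controls smoothness and correlation length.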

Fisher Information Matrix

How do we compute FIM?

\(FIM = \left(\nabla \log p( \varphi^t(x_0) | \varphi^0(x_0))\right)\left(\nabla \log p( \varphi^t(x_0) | \varphi^0(x_0))\right)^T\)

  • This just means that we compute the FIM with respect to the initial vorticity, \(\varphi^0(x_0)\). In practice we average this outer product over \(M\) noisy observations, as before.
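Conceptually, the score with respect to \(\varphi^0(x_0)\) is obtained by back-propagating the residual through the composed time-stepping map. A hedged sketch with a stand-in linear solver step \(B\), for which \(\partial\varphi^t/\partial\varphi^0 = B^t\) (a real solver step is nonlinear, and autodiff would play the role of the repeated \(B^T\) applications):

```python
import numpy as np

rng = np.random.default_rng(5)
d, t_steps = 32, 5
B = np.eye(d) + 0.01 * rng.standard_normal((d, d))   # stand-in linear solver step

def step(phi):
    return B @ phi

phi0 = rng.standard_normal(d)
phi_t = phi0
for _ in range(t_steps):
    phi_t = step(phi_t)                   # forward roll-out: phi^t = B^t phi^0

# For a Gaussian likelihood, the score w.r.t. phi^0 is J_t^T r with
# J_t = d phi^t / d phi^0 = B^t here, so we apply B^T once per step, in reverse.
r = rng.standard_normal(d)                # residual y - phi^t for one observation
score = r.copy()
for _ in range(t_steps):
    score = B.T @ score                   # reverse-mode chain through the steps
```

The outer product of such scores, averaged over observations, gives the FIM with respect to the initial vorticity.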

How does the FIM look as \(M\) changes?

Figure 11: FIM[:100, :100] for varying \(M\) (panels: \(M = 10, 100\))

Making Sense of the FIM Obtained

Still, does our FIM make sense? How can we better understand what the FIM is representing?

Let’s look at the first row of the Fisher Information Matrix and reshape it to [64, 64].

Figure 12: Comparison of the input vorticity with the first row of the FIM (panels: FIM[0, :] and the input vorticity)

Also, let’s look at how the first row of the FIM changes as time evolves, with \(M=10\).

Figure 13: The evolution of the first row of the FIM (panels: \(t = 1, 5, 10, 15, 20, 25, 30, 35, 40, 44\))

A single FIM for each data point in a single time series

A single time series, \(\{\varphi^{t}(x_0)\}^T_{t=1}\), consists of multiple data points.

Given such a time series (an example time series is shown above), we look at the first row of the FIM when \(M=10\).

Figure 14: The evolution of the first row of the FIM (panels: \(t = 1, 2, \dots, 10\))

A single FIM for a single time series

Future Step

  1. TODO: Debug the NS eigenvectors and vjp.
  2. TODO: Generate the full dataset for Francis’ dataset (which might take 1 or 2 days).
  3. TODO: Try it on Jason’s dataset. (Now that we have fixed the problem with the FIM computation, we are optimistic about the experiment, so we want to try it again.)

Question

  1. What would be the optimal number of observations, \(M\), when computing the Fisher Information Matrix?